The Journal of Neuroscience
● Society for Neuroscience
Preprints posted in the last 30 days, ranked by how well they match The Journal of Neuroscience's content profile, based on 928 papers previously published here. The average preprint has a 0.47% match score for this journal, so anything above that is already an above-average fit.
Dong, C.; Wang, Z.; Zuo, X.; Wang, S.
Interpersonal communication relies on integrating facial and vocal signals to extract multidimensional communicative information. How the absence of audition reshapes the communicative system remains unclear. We compared the performance of deaf (N=136) and hearing (N=135) adults across multiple domains (facial identity, emotional expression, speech, and global motion) through a series of unisensory and audiovisual psychophysical tasks. The results showed that, in hearing individuals, reliance on facial versus vocal signals differed across domains. In deaf individuals, auditory deprivation did not produce uniform enhancement or impairment of visual processing. Instead, they exhibited reduced sensitivity to dynamic emotional expressions and global motion, preserved sensitivity to facial identity (both static and dynamic) and static expressions, and enhanced categorization of facial speech. Notably, sensitivity to dynamic facial expressions and global motion was correlated, and both were explained by variations in fluid intelligence. Our results provide a systematic characterization of visual function across domains in deaf individuals, suggesting that the consequences of hearing loss are shaped both by the functional roles of audition within each domain and by broader cognitive adaptations. These findings advance understanding of cross-modal plasticity and inform the development of targeted, ecologically valid accessibility and sensory-substitution strategies.
Hebisch, J.; Van Puyenbroeck, P.; Schwabe, L.; de Gee, J. W.; Donner, T. H.
Brainstem arousal systems, including the locus coeruleus noradrenaline system, respond transiently to behaviorally relevant events. Locus coeruleus activity also drives dilations of the pupil, which are often observed during cognitive tasks. The strength of pupil responses during encoding of stimulus material predicts the success of its later retrieval, which might reflect the impact of noradrenaline on synaptic plasticity and memory formation. The pupil also dilates in response to task-irrelevant sounds, which could therefore serve as a valuable tool for investigating causal effects of phasic, pupil-linked arousal on cognition. Here, we evaluated whether task-irrelevant white noise sounds affect memory formation and memory-based decisions. These sounds were played before, during or after the presentation of memoranda (images or spoken words). Memory success was measured in recognition and free recall tasks the day after. Trial-to-trial variations in the amplitude of pupil dilations during word encoding without task-irrelevant sounds predicted memory success. Task-irrelevant white-noise sounds also robustly dilated the pupil but did not improve memory formation for the words or the images. We conclude that pupil-linked arousal processes triggered by task-irrelevant sounds differ from those recruited endogenously during memory formation, for example in states of increased emotionality or attention.
Manyukhina, V.; Barlaam, F.; Vergne, J.; Bain, A.; Abdoun, O.; Daligault, S.; Delpuech, C.; Jerbi, K.; Sonie, S.; Bonnefond, M.; Schmitz, C.
To compensate for self-generated movement-induced postural disturbances, the brain generates anticipatory postural adjustments (APA), ensuring smooth, coordinated actions. APA development continues into late adolescence, yet the specific pathways and mechanisms that remain immature in children are poorly understood. We studied APA mechanisms in 24 children (7-12 years old) using magnetoencephalography (MEG) while they performed the naturalistic bimanual load-lifting task (BLLT). In the BLLT, participants lift a load placed on one forearm with the contralateral hand while keeping the postural forearm horizontal, as if lifting a glass from a tray. To counteract forearm deflection caused by unloading, the brain generates APAs, which involve anticipatory inhibition of the postural Biceps brachii. We found that stronger anticipatory Biceps brachii inhibition was associated with reduced excitability, as indexed by high-gamma (90-130 Hz) suppression, and increased high-beta power (19-29 Hz) in the contralateral Supplementary Motor Area (SMA). Analysis of transient beta events revealed two functionally distinct burst types: (1) 19-24 Hz bursts: time-locked to immediate high-gamma suppression correlated with 26-28 Hz beta power; predicted stronger muscle inhibition and received directed input from middle frontal cortex and precentral gyrus; (2) 24-29 Hz bursts: linked to delayed (~100 ms) high-gamma suppression correlated with 8 Hz alpha power; predicted earlier and prolonged muscle inhibition and better forearm stabilization, but did not show directional influence from other regions. Results on anticipatory inhibition-related beta bursts replicated mechanisms reported in adults, suggesting that the efferent pathways and transient inhibitory processes underlying APA may already be mature in children. In contrast, higher-frequency beta bursts revealed a child-specific, complementary APA mechanism that may compensate for imprecise anticipatory inhibition.
These results reveal two oscillatory mechanisms supporting APA in children and indicate that beta bursts may reflect both immediate cortical inhibition linked to muscle control and indirect alpha-mediated inhibition likely compensating for forearm instability.
Xue, A. M.; Hsu, S.; LaRocque, K. F.; Raccah, O. M.; Gonzalez, A.; Parvizi, J.; Wagner, A. D.
Episodic memory depends on neural representations encoded in the hippocampus. Experimental and computational evidence suggests that the hippocampus encodes pattern-separated representations that support later recall of episodic event elements. While extant data in humans predominantly focus on assaying the relationship between the similarity of spatial neural patterns at encoding and later memory performance, similarity of neural patterns in the temporal domain may also reveal encoding computations predictive of future memory. To examine how the similarity among temporal patterns of hippocampal activity during encoding relates to later episodic retrieval (associative cued recall and recognition memory), hippocampal activity was recorded from human participants (n=7) with implanted intracranial electrodes while they encoded arbitrary (A-B) paired-associates. Subsequent memory analyses first revealed that hippocampal high-frequency broadband power (HFB; 70-180Hz) was linked to a graded increase in memory strength; HFB power was greater during the encoding of pairs later correctly recalled relative to events later recognized and was lowest for events later forgotten. Second, and critically, subsequent memory analyses further revealed that more distinctive temporal patterns in the hippocampus during encoding -- indexed by the similarity of the HFB timeseries elicited by a given event to that elicited by other events -- were associated with superior subsequent memory performance. Finally, exploratory analyses revealed stimulus category effects on hippocampal HFB power during encoding and retrieval cuing. These results indicate that the temporal distinctiveness of hippocampal traces during encoding is important for subsequent retrieval of episodic event elements, consistent with theories that posit that pattern separation facilitates future remembering.
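As a rough illustration of the kind of index described in the abstract above, the following sketch scores each event by the mean Pearson correlation of its time series with those of all other events, so that a lower score marks a more distinctive temporal pattern. It is a minimal sketch under stated assumptions, not the authors' analysis pipeline: the function names `pearson` and `mean_similarity_to_others`, the choice of Pearson correlation as the similarity measure, and the toy data are all hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def mean_similarity_to_others(timeseries):
    """For each event, average its correlation with every other event.

    Lower values indicate a more distinctive temporal pattern.
    """
    scores = []
    for i, current in enumerate(timeseries):
        others = [pearson(current, other)
                  for j, other in enumerate(timeseries) if j != i]
        scores.append(sum(others) / len(others))
    return scores


# Toy data (hypothetical): two rising time series and one falling one.
events = [
    [1.0, 2.0, 3.0, 4.0],
    [1.5, 2.5, 3.5, 4.5],
    [4.0, 3.0, 2.0, 1.0],
]
scores = mean_similarity_to_others(events)
most_distinctive = scores.index(min(scores))
print(scores, most_distinctive)
```

In this toy set the falling time series is anti-correlated with both rising ones, so it receives the lowest mean similarity and is flagged as the most distinctive event.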
Owoc, M. S.; Lee, J.; Johnson, A.; Kandler, K.; Sadagopan, S.
The inferior colliculus (IC) integrates ascending auditory and descending multimodal inputs within distinct subdivisions, the central nucleus (CNIC) and cortex (CtxIC). Despite differences in connectivity, auditory responses in these subdivisions are similar, complicating localization during in-vivo recordings. Here, we tested whether recordings can be assigned to CNIC or CtxIC using only response properties in awake and anesthetized mice. We constructed frequency response areas (FRAs) from pure tone responses and extracted tuning and firing metrics. Individual FRA features could not reliably localize recordings. In contrast, a random forest classifier combining FRA-derived features accurately localized recordings to CNIC or CtxIC across states. These findings demonstrate that while IC subdivisions differ only subtly along individual response parameters, appropriate multiparametric approaches can enable robust classification. More broadly, our results illustrate how biologically meaningful distinctions may be revealed by combining weakly informative features, an approach that can be applied across diverse brain regions and modalities.
Martorell, J.; Di Liberto, G.; Molinaro, N.; Meyer, L.
Speech comprehension involves the inference of abstract information from continuous acoustic signals. Prior work suggests that electrophysiological activity is synchronized with abstract linguistic structures (phrases and sentences) during the processing of isochronous syllable sequences. It is yet unclear whether this prior evidence generalizes to natural speech comprehension, which requires the flexible processing of continuous speech, where syllables and other types of linguistic units are anisochronous. Our magnetoencephalography experiment investigated neural synchronization to acoustic (syllables) and abstract units (phrases and sentences) using continuous speech ranging from artificial isochronous to more natural anisochronous. We find that neural synchronization to phrases and sentences, but not syllables, is resilient to naturalistic anisochrony. This suggests that linguistic structure processing reflects endogenous inferences that are fundamentally distinct from the exogenous processing of syllables driven by speech acoustics. Lateralization and linear regression results extend this functional dissociation as hemispheric asymmetry: stimulus-independent leftward lateralization for linguistic structure processing but stimulus-driven rightward lateralization (or bilaterality) for both syllable and acoustic processing. Our findings provide a more realistic characterization of the flexible neural mechanisms supporting the efficient comprehension of natural speech.
Comas, V.; Pouso, P.; Borde, M.
Gymnotiform fish emit electric organ discharges (EODs) for both active electroreception and electrocommunication. EOD waveform and rhythm can be modified to cope with diverse environmental challenges. In pulse-type species, EODs are generated by a hierarchical electromotor network controlled by a medullary pacemaker nucleus (PN), which comprises intrinsic pacemaker cells (PM-cells) and projecting relay cells (R-cells). Active electroreception requires the emission of stereotyped EODs, an electromotor output that implies a functional PN configuration in which PM-cells rhythmically time EODs and R-cells transmit coordinated commands to downstream components of the electromotor system. To test whether electrical coupling (EC) between PN neurons supports this functional organization, intrinsic connectivity of the PN in Gymnotus omarorum was examined in brainstem slices using electrophysiology, immunohistochemistry, and dye-coupling analysis. Homotypic connections (PM-PM and R-R) exhibited low-magnitude, bidirectional EC with symmetrical, low-pass filter properties, supporting synchronous yet adaptable pacemaker activity and coordinated descending commands. Heterotypic connections (PM-R) also displayed bidirectional, symmetrical coupling but revealed direction-dependent filtering: an apparent high-pass behavior from PM- to R-cells and a low-pass behavior in the opposite direction. Together with precise PM-to-R discharge timing, direction-dependent filtering suggests a role of PM-cell axons in shaping signal flow. Dye coupling and immunohistochemical evidence further indicate that PN neurons are interconnected via gap junctions, likely formed by connexin 35. Thus, EC-based connectivity endows the PN with crucial functional attributes of its exploration mode of operation while preserving the capacity to organize communication signals under the influence of descending inputs, revealing remarkable functional versatility. 
Summary statement: Gap junction-mediated intrinsic connections within the electromotor nucleus of electric fish may sustain the emission of signals essential for sensory sampling as well as those supporting communication.
King, C. D.; Groh, J. M.
Eye movement-related eardrum oscillations (EMREOs) appear to consist of a pulse of oscillation occurring in conjunction with saccades. However, this apparent pulse could occur either because there is an increase in energy at that frequency at the time of saccades (a true pulse), or because there is saccade-related phase resetting of ongoing energy in that frequency band, which appears like a pulse when averaged in the time domain across many trials. Here we conducted a spectral analysis at the individual trial level in humans performing a visually guided saccade task to determine whether the power at the EMREO frequency (30-45 Hz) is higher during saccades than during steady fixation. We found both an increase in sound power in the EMREO frequency band associated with saccades, i.e., sound pulses at the individual trial level, and phase resetting at saccade onset/offset. While both factors contribute to the apparently pulse-like EMREO signal, phase resetting appears to be more prevalent across participants. The prevalence of phase resetting has implications for the underlying mechanism(s) producing EMREOs as well as functional consequences for how the ear might respond to incoming sound in an eye-position dependent fashion.
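The distinction drawn in this abstract, between a true increase in power and phase resetting that only looks like a pulse after averaging, can be illustrated with a small simulation. The sketch below is an assumption-laden toy, not the authors' analysis: every trial is a unit-amplitude 35 Hz sinusoid (a value chosen from within the 30-45 Hz band mentioned above) whose phase is random before a hypothetical reset sample and identical across trials afterwards. The sampling rate, trial count, and the names `simulate_trials` and `rms` are all illustrative.

```python
import math
import random


def simulate_trials(n_trials=200, n_samples=200, reset_at=100,
                    freq_hz=35.0, fs=1000.0, seed=0):
    """Unit-amplitude sinusoids: random phase before reset_at, common phase after."""
    rng = random.Random(seed)
    trials = []
    for _ in range(n_trials):
        phase = rng.uniform(0.0, 2.0 * math.pi)
        trial = []
        for i in range(n_samples):
            if i < reset_at:
                # Ongoing oscillation with a trial-specific phase.
                trial.append(math.sin(2.0 * math.pi * freq_hz * i / fs + phase))
            else:
                # After the reset, every trial restarts at phase zero.
                trial.append(math.sin(2.0 * math.pi * freq_hz * (i - reset_at) / fs))
        trials.append(trial)
    return trials


def rms(values):
    """Root-mean-square amplitude of a sequence."""
    return math.sqrt(sum(v * v for v in values) / len(values))


trials = simulate_trials()
average = [sum(column) / len(column) for column in zip(*trials)]

# Amplitude of the trial-averaged waveform before and after the reset.
pre_avg, post_avg = rms(average[:100]), rms(average[100:])

# Mean single-trial amplitude before and after the reset.
pre_single = sum(rms(t[:100]) for t in trials) / len(trials)
post_single = sum(rms(t[100:]) for t in trials) / len(trials)

print(pre_avg, post_avg, pre_single, post_single)
```

In this toy case the trial average is near zero before the reset and large after it, even though single-trial amplitude never changes, which is why an analysis at the individual trial level is needed to tell the two accounts apart.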
Bair, M. B.; Long, N. M.
It is critical to identify which factors induce specific brain states as these large-scale patterns of coordinated neural activity drive downstream processing and behavior. The retrieval state, a brain state engaged when attempting to retrieve the past, is thought to specifically support episodic memory, remembering experiences within a spatiotemporal context, as opposed to semantic memory, remembering general knowledge. However, we hypothesize that the retrieval state reflects internal attention engaged to access stored episodic and semantic information. To test these alternatives, we recorded scalp electroencephalography while participants made episodic, semantic, or perceptual judgments, and applied an independently validated mnemonic state classifier to measure retrieval state engagement. We found that retrieval state engagement was greater for both episodic and semantic judgments compared to perceptual judgments. These findings suggest that the retrieval state reflects a domain-general internal attention process that supports not just episodic memory, but internally directed cognition.
Marrazzo, G.; Pimpini, L.; Kochs, S.; De Martino, F.; Valente, G.; Roefs, A.
Despite substantial progress in understanding how visual features of food are processed in the brain, it remains unclear how subjective and nutritional properties, such as perceived palatability, caloric content, and health value, are reflected in neural representational structure. Using functional MRI and representational similarity analysis (RSA), we examined how visual, subjective, and nutritional food properties are encoded in ventral visual cortex. Univariate analyses revealed reliable activation differences between high- and low-calorie foods in lateral occipitotemporal cortex (LOTC) and fusiform gyrus. RSA further revealed a functional dissociation within the ventral stream: LOTC showed systematic correspondence with both visual and subjective dimensions, whereas fusiform cortex exhibited a selective association with perceived caloric content, with both effects persisting after controlling for visual similarity. These results suggest that food-related dimensions not fully captured by the tested visual models are reflected within visual representational spaces, and that LOTC and fusiform cortex show dissociable representational profiles with respect to subjective and perceived nutritional food dimensions.
Zhu, J.; Smith, C. R.; Garin, C. M.; Zhou, X. M.; Calabro, F.; Luna, B.; Constantinidis, C.
Response inhibition is a critical cognitive process that is not fully mature at the time of puberty but continues to improve during adolescence. To understand the neural basis of the maturation process, we obtained longitudinal behavioral, neurophysiological, and imaging data in macaque monkeys as they aged through adolescence. Behavioral performance in several variants of the antisaccade task improved markedly through this period. Neural activity in the prefrontal cortex generally increased, particularly when synchronized to the saccade generation. Trajectories of neural activity and cognitive performance were well predicted by maturation of long-distance white matter tracts connecting the frontal lobe with other brain areas. Our results link the maturation of response inhibition and prefrontal neural activity changes to white matter maturation.
Eustace, S. D.; Guediche, S.; Brasiello, L.; Rocha, M.; Correia, J. M.
Speech production requires orchestration of multiple brain systems, including cortical and subcortical areas that support the unfolding of the spoken message across hierarchical linguistic levels, such as phonemes, syllables, words or phrases. Transitions between levels are critical for fluent speech, yet the neural dynamics of, for example, syllable-level and word-level transitions remain unknown. In this electroencephalography (EEG) study, we use time-frequency analysis and source localization to determine differences associated with word-boundary vs. within-word syllable transitions. To this end, pseudoword pairs comprising six consonant-vowel (CV) syllables with different word-boundary positions were used. Fluent human adults produced the utterances at the rhythm of a learned visual metronome (i.e., syllable-by-syllable), such that each syllable was uttered at matching times independently of its relative word position. Accordingly, a target syllable could be either a within-word syllabic transition or a between-word transition, while other linguistic properties, including articulation, stress pattern, co-articulation or prosody, were matched. EEG time-frequency analyses of neural sources successfully revealed sensitivity to hierarchical structure. Neural sources in left and right inferior frontal lobes, as well as left superior temporal lobe were differentially recruited when producing the same exact syllables, in the same exact utterance position, but under different word boundary contexts. A right inferior frontal source showed a robust time-frequency modulation in word transitions that included elevated event-related synchronization in the theta and beta range. Interestingly, despite our efforts to control speech pace across conditions using metronome-based guidance, small, albeit significant timing delays emerged, confirming higher cognitive demands at word boundaries.
Hetsch, F.; Santini, I.; Buetfering, C.; Ruggieri, S.; Jacobi, E.; von Engelhardt, J.
Relay neurons of the dorsal lateral geniculate nucleus (dLGN) receive convergent inputs from retinal ganglion cells (RGCs). Retinogeniculate synapses exhibit a highly skewed distribution of synaptic strength, with a few strong inputs and many weak ones. Strong synapses are thought to dominate relay neuron activity. However, the contribution of individual inputs might depend not just on strength but also on short-term plasticity. Using minimal stimulation recordings in acute mouse brain slices, we analyzed the electrophysiological properties of individual retinogeniculate synapses. We observed a robust inverse correlation between synaptic strength and short-term plasticity: weak synapses showed facilitation, whereas strong synapses exhibited pronounced depression. This was consistent with increasing vesicle release probability and enhanced AMPA receptor desensitization at stronger synapses. Analysis of synaptic current kinetics further suggested that variability in synaptic strength reflects not only differences in synapse size and AMPA receptor content but also differences in the electrotonic distance of synapses from the soma. Together, these results reveal systematic heterogeneity in both presynaptic and postsynaptic properties of retinogeniculate synapses. Therefore, the relative contribution of weak and strong inputs to relay neuron firing is likely activity-dependent, with strong synapses dominating when RGCs fire few action potentials and weaker inputs contributing more during sustained or high-frequency firing with several action potentials.
Kalidindi, H. T.; Crevecoeur, F.
Successful goal-directed movements depend on the central nervous system's (CNS) ability to handle diverse physical interactions. The CNS is thought to handle different dynamical contexts through three mechanisms: (i) trial-by-trial adaptation when forces are predictable, (ii) a model-free robust control strategy, and (iii) online adaptation of feedback responses. While each has been studied independently, their relative contributions and the possibility that they are recruited to different extents across contexts are unknown. Here, we quantified all three strategies within the same individuals to examine how the CNS exploits them under varying environmental conditions. Participants (19 female, 15 male) performed reaching tasks while interacting with robot-generated force-fields that were either consistent or varied unpredictably. Trial-by-trial adaptation was measured using standard force channels to isolate anticipatory compensation. Robust control was assessed through movement velocity and corrective force magnitude. Online adaptive control was quantified by the temporal alignment between commanded and measured forces within a movement. Results showed that participants improved anticipatory compensation in consistent environments and relied on both robust and online adaptation when perturbations were unpredictable. Crucially, markers of robust control dominated the early movement phase, whereas online adaptation dominated later corrections. This temporal dissociation was confirmed by electromyographic recordings. Markers of robust and online adaptive feedback strategies also statistically predicted participants' ability to adapt across trials in consistent environments, revealing a common trait linking online control and adaptation. These findings reveal a rich and flexible combination of control mechanisms, offering a new framework for understanding the neurophysiological bases of reaching control.
Significance Statement: Human reaching control is a complex behavior resulting from several mechanisms that orchestrate feedback responses to mechanical perturbations and adaptation to changes in the environment. Here we combine previously studied paradigms to highlight within the same groups of healthy volunteers that three major components are recruited to different extents depending on the context: unpredictable environments promote concomitant use of robust control and online adaptation, whereas predictable environments recruit standard adaptation based on anticipatory compensation. Remarkably, individuals' adaptive capabilities correlated across consistent and inconsistent environments, suggesting a key involvement of adaptive mechanisms in both online control and trial-by-trial adaptation. Robust control, online adaptation, and anticipatory compensation are dissociable behaviorally, and are used to varying levels as a result of individual traits.
Ziobro, P.; Zheng, D.-J.; Rawal, A.; Zhou, Z.; Mittal, A.; Tschida, K. A.
Animals produce different vocalization types, which differ in their acoustic features and are produced in different behavioral contexts. How vocalization-related brain circuits are organized to enable the production of different vocalization types remains poorly understood. The nucleus retroambiguus is a hindbrain premotor region that regulates the production of both ultrasonic vocalizations (USVs) and distress calls (squeaks) in adult mice, but whether distinct or overlapping populations of RAm neurons are recruited during the production of these two vocalization types is unknown. In the current study, we used Fos immunohistochemistry to compare the counts and spatial distributions of Fos-positive RAm neurons in males and females that produced USVs and females that produced courtship squeaks. We also combined in vivo activity-dependent (TRAP2) labeling with Fos immunohistochemistry to directly compare Fos expression associated with the production of USVs and courtship squeaks in the same females. Our findings suggest that RAm contains three vocalization-related populations of neurons: squeak-related neurons, USV-related neurons, and shared neurons that are recruited during both vocalization types. These findings refine current models of the premotor control of vocalization and set the stage for future work to explore anatomical and functional heterogeneity within RAm.
Nazemorroaya, A.; Batten, S.; Grunfeld, I.; Torres, A.; Celaya, X.; Moreland, O.; Lattuca, C.; Wagle, A.; Nikjou, D.; Barbosa, L. S.; Lohrenz, T.; Chiu, P.; Brewer, G. A.; McClure, S.; Witcher, M. R.; Bina, R. W.; Montague, P. R.; Dayan, P.; Bang, D.
Dopamine is believed to modulate not only instrumental learning about the link between states, actions, and outcomes but also reflexive behaviours, such as a Pavlovian bias to approach in rewarding states and freeze in aversive ones. We studied these dual roles in the human brain by combining intracranial dopamine recordings from the anterior cingulate cortex (ACC) -- a region implicated in behavioural and cognitive control -- with a motivational Go/NoGo task involving conflict between instrumental and Pavlovian action selection. We found evidence that dopamine in the ACC is involved in evaluating whether Pavlovian responding should guide behaviour. This computational motif was observed across multiple task events, including in response to rewards and punishments, and in analyses based on a reinforcement learning model. Our results indicate that dopamine supports learning at the more abstract level of behavioural policies in addition to the more concrete levels of states and actions.
Souffi, S.; Nelken, I.
The ventral tegmental area (VTA) is a key region in the reward system of the vertebrate brain, yet its role in sensory processing remains largely unexplored. Here, we use fiber photometry in awake freely-moving mice to investigate how the VTA represents auditory stimuli. We compared VTA responses to those in the inferior colliculus (IC) recorded with the same technique. Neural activity in the VTA exhibited robust responses to a wide variety of auditory stimuli, including broadband noise, pure tones, and music stimuli. We identified a subset of short-latency trials, comparable to those observed in the IC, with response durations that were longer than in the IC. VTA responses to complex sounds showed minimal envelope tracking and low temporal reproducibility, resulting in poor discriminability. This study positions the VTA as a potentially active player in shaping the perception of sound.
Miller, D. J.; Kaas, J. H.
Visual acuity spans more than a 100-fold range in mammals and yet the neural correlates of this perceptual gradient have not been fully evaluated. Furthermore, even though the known derived features of the human brain include specific changes to the cerebral cortex and the visual system in particular, no evolution of developing life histories approach has been quantitatively applied to metrics of cell composition. In this study, we present stereological estimates of neuron and glia density in V1 and granular layer 4 in a comparative sample of primates. We then integrate these data with the literature to construct a larger comparative dataset to test for phylogenetic relationships in mammalian visual system organization. Our examination revealed a primary relationship between acuity and neuron number along with secondary relationships among cell types tied to metabolic maintenance and support across the lifespan. Retinal metabolics along with phylogenetic position account for a large amount of V1 neuron density, which is further related to acuity, while glia are related to longevity. These results identify a dissociation in the evolutionary developmental organization and senescence of V1 that map onto first principal parameters to explore the phylogenetic position and ecological pressures acting on the mammalian visual system. In accord with the literature, humans are revealed as outliers in glial support of neuronal metabolism across the lifespan. These findings provide evidence that mammalian visual cortex varies along at least two partially separable cellular dimensions in which visual resolution differs from lifelong maintenance.
Summary and Significance: Whether the cellular organization of visual cortex is shaped by a single constraint or multiple independent pressures is debated. We used stereology to test whether V1 neurons and glia make separable contributions to visual performance across the mammal lifespan.
V1 neuron density, together with brain size, predicts visual acuity. The glia-to-neuron ratio is alternatively associated with maximum lifespan after controlling for neuron density. Humans and chimpanzees have nearly identical V1 neuron densities, yet humans appear to show substantially elevated glial investment across the literature. These findings suggest that mammalian visual cortex evolves under at least two partially separable pressures, with neuron density and spatial resolution on one hand and glial investment to sustain function across life on the other.
Chen, W.; Pell, M.; Jiang, X.
People encounter AI voices daily. Existing behavioral studies suggest listeners rely on prosodic cues such as intonation and expressiveness to detect audio deepfakes, reporting that AI voices sound prosodically less rich than human voices. To test whether prosodic processing drives deepfake discrimination in the neural time course of voice processing, we recorded electroencephalographic (EEG) data while participants listened to human and AI-generated speakers producing utterances in confident vs. doubtful prosody (tone of voice), with attention directed toward memorizing speaker names. We used voice cloning to control for speaker identity confounds between human and AI voices. Multivariate pattern analysis (MVPA) revealed that neural discrimination of human vs. AI voices emerged rapidly regardless of prosody (confident: 176 ms; doubtful: 134 ms), substantially preceding prosody discrimination (confident vs. doubtful within human voices: 2066 ms; within AI voices: 1366 ms). Acoustic analysis confirmed that prosodic distinctions became classifiable only at utterance offset (90% normalized duration), converging with neural evidence that prosody requires near-complete temporal integration. This temporal dissociation between rapid voice source discrimination and late-emerging prosody decoding suggests that prosody plays a smaller role in audio deepfake detection than listeners retrospectively report. Representational similarity analysis further revealed that spectral envelope features (mel-frequency cepstral coefficients; MFCC), rather than the visually salient high-frequency energy differences, drove neural human-AI discrimination, with the MFCCs' earliest independent contribution (228 ms) closely following the MVPA decoding onset (134-176 ms). Future studies may manipulate specific acoustic components to establish the causal sources of this rapid and sustained neural discrimination.
Significance Statement: People encounter AI voices daily, in phone calls, navigation apps, supermarket checkouts, and subway announcements. Using electroencephalography, we show that the human brain automatically and rapidly distinguishes everyday AI voices from human speech, even without conscious attention to voice source. Although people may attribute this ability to AI voices sounding monotone or prosodically unnatural, the brain relies on subtler acoustic signatures, enabling discrimination before prosodic information becomes available. Attempts to identify the specific acoustic features driving this neural detection were inconclusive, pointing to the need for future causal investigations. We encourage engineers and policymakers to ensure AI voices remain perceptually detectable, as increasingly humanlike AI voices could cognitively disadvantage the general public if they become indistinguishable from human speech.
White, D. N.; Kushner, J. K.; Winther, K. E.; McGovern, D. J.; Basta, T.; Hoeffer, C. A.; Donaldson, Z. R.; Root, D. H.; Stowell, M. H. B.
Neurotransmitter co-transmission contributes to diverse physiological processes throughout the mammalian brain, including sensory integration, motivational control, and social behaviors. Projections from the globus pallidus internus (GPi; the entopeduncular nucleus, EPN, in rodents) to the lateral habenula (LHb) are well-characterized by the co-transmission of both GABA and glutamate. These dual-release inputs modulate behavioral states in chronically learned helpless (cLH) rats, influencing both the onset and recovery of pathological phenotypes. Here, we employed confocal 3D reconstructions that confirmed the presence of both vesicular transporters VGAT and VGLUT2 in EPN axon terminals within the LHb. Further investigation revealed that GABA and glutamate are packaged in distinct vesicle populations within individual presynaptic terminals. Notably, the calcium (Ca²⁺) sensors Synaptotagmin-2 (Syt2) and Synaptotagmin-3 (Syt3) are highly expressed in the EPN whereas expression of the canonical Ca²⁺ sensor, Synaptotagmin-1 (Syt1), is downregulated. Additionally, using confocal microscopy, we observed selective spatial correlations between Syt2 and VGLUT2 and between Syt3 and VGAT in LHb axon terminals. These observations strongly suggested that Syt2 serves as the predominant Ca²⁺ sensor for glutamatergic vesicle fusion, and Syt3 serves as the predominant Ca²⁺ sensor for GABAergic vesicle fusion in the LHb. To test this hypothesis, we employed targeted antisense oligonucleotide (ASO) knockdown of Syt2 and Syt3 in EPN neurons and measured LHb glutamatergic and GABAergic currents. Syt2 knockdown resulted in an increase in mEPSC frequency, amplitude, half-width and decay, suggesting increased glutamate vesicle release probability and increased glutamate vesicle packing. However, Syt2 knockdown had no influence on mIPSC amplitude or frequency.
On the other hand, Syt3 knockdown had no apparent effect on glutamate release but caused an increase in mIPSC frequency suggesting increased quantal release probability of GABA. Together, these findings identify a molecular mechanism by which synaptotagmin isoforms govern differential glutamate and GABA release at EPN dual-transmitter terminals in the LHb. These results provide evidence for presynaptic mechanisms regulating excitatory-inhibitory balance within this brain structure and importantly provide molecular targets for pharmacological intervention.